Data Visualization

Incremental Learning in Diagonal Linear Networks

Updated: 2023-07-31 22:55:03

Diagonal linear networks (DLNs) are a toy simplification of artificial neural networks; they consist in a quadratic reparametrization of linear regression inducing a sparse implicit regularization. In this paper, we describe the trajectory of the gradient flow of DLNs in the limit of small initialization. We show that incremental learning is effectively performed in the limit: coordinates are successively activated, while the iterate is the minimizer of the loss constrained to have support on the active coordinates only. This shows that the sparse implicit regularization of DLNs decreases with time. This work is restricted to the underparametrized regime with anti-correlated features for technical reasons.
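The incremental activation described in this abstract can be illustrated with a small sketch. It is not the paper's setting or proof: it assumes the common parametrization beta = u*u - v*v, an orthonormal design (so the loss is 0.5 * ||beta - target||^2 and coordinates decouple), and discrete gradient descent in place of gradient flow. The names `train_dln`, `alpha`, `lr`, and `target` are hypothetical.

```python
def train_dln(target, alpha=1e-4, lr=0.01, steps=3000):
    """Gradient descent on a diagonal linear network beta = u*u - v*v.

    Assumption: orthonormal features, so the loss reduces to
    0.5 * sum((beta_i - target_i) ** 2) and each coordinate evolves alone.
    """
    d = len(target)
    u = [alpha] * d  # small initialization, beta starts at 0
    v = [alpha] * d
    activation_step = [None] * d
    for step in range(steps):
        for i in range(d):
            beta_i = u[i] ** 2 - v[i] ** 2
            # Record the first step at which coordinate i is "active".
            if activation_step[i] is None and abs(beta_i) > 0.5 * abs(target[i]):
                activation_step[i] = step
            g = beta_i - target[i]  # dLoss/dbeta_i
            # Chain rule: dLoss/du = 2*u*g, dLoss/dv = -2*v*g.
            u[i], v[i] = u[i] - lr * 2 * u[i] * g, v[i] + lr * 2 * v[i] * g
    beta = [u[i] ** 2 - v[i] ** 2 for i in range(d)]
    return beta, activation_step


target = [3.0, -2.0, 1.0]
beta, activation_step = train_dln(target)
# Coordinates sorted by the step at which they became active.
activation_order = sorted(range(len(target)), key=lambda i: activation_step[i])
```

In this toy run the coordinates with larger target magnitude leave zero first, and shrinking `alpha` widens the gaps between activation steps, which is the qualitative picture the abstract describes for the small-initialization limit.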

Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

Updated: 2023-07-31 22:55:03

Brian R. Bartoldson, Bhavya Kailkhura, Davis Blalock; 24(122):1-77, 2023. Abstract: Although deep learning has made great progress in recent years, the exploding economic and environmental costs of training neural networks are becoming unsustainable. To address this problem, there has been a great deal of research on algorithmically-efficient deep learning, which seeks to reduce training costs not at the hardware or implementation level, but through changes in the semantics of the training program. In this paper, we present

Maximum likelihood estimation in Gaussian process regression is ill-posed

Updated: 2023-07-31 22:55:03

Toni Karvonen, Chris J. Oates; 24(120):1-47, 2023. Abstract: Gaussian process regression underpins countless academic and industrial applications of machine learning and statistics, with maximum likelihood estimation routinely used to select appropriate parameters for the covariance kernel. However, it remains an open problem to establish the circumstances in which maximum likelihood estimation is well-posed, that is, when the predictions of the regression model are insensitive to small perturbations of the data. This

An Annotated Graph Model with Differential Degree Heterogeneity for Directed Networks

Updated: 2023-07-31 22:55:03

Stefan Stein, Chenlei Leng; 24(119):1-69, 2023. Abstract: Directed networks are conveniently represented as graphs in which ordered edges encode interactions between vertices. Despite their wide availability, there is a shortage of statistical models amenable to inference, especially when contextual information and degree heterogeneity are present. This paper presents an annotated graph model with parameters explicitly accounting for these features. To overcome the curse of dimensionality due to modelling degree

DART: Distance Assisted Recursive Testing

Updated: 2023-07-31 22:55:03

Xuechan Li, Anthony D. Sung, Jichun Xie; 24(169):1-41, 2023. Abstract: Multiple testing is a commonly used tool in modern data science. Sometimes, the hypotheses are embedded in a space; the distances between the hypotheses reflect their co-null/co-alternative patterns. Properly incorporating the distance information in testing will boost testing power. Hence, we developed a new multiple testing framework named Distance Assisted Recursive Testing (DART). DART features in joint artificial intelligence (AI) and statistics modeling. It has two stages. The

A Unified Framework for Optimization-Based Graph Coarsening

Updated: 2023-07-31 22:55:03

Manoj Kumar, Anurag Sharma, Sandeep Kumar; 24(118):1-50, 2023. Abstract: Graph coarsening is a widely used dimensionality reduction technique for approaching large-scale graph machine-learning problems. Given a large graph, graph coarsening aims to learn a smaller-tractable graph while preserving the properties of the originally given graph. Graph data consist of node features and graph matrix (e.g., adjacency and Laplacian). The existing graph coarsening methods ignore the node features and rely solely on a graph matrix to simplify graphs.

Robust Methods for High-Dimensional Linear Learning

Updated: 2023-07-31 22:55:03

Ibrahim Merad, Stéphane Gaïffas; 24(165):1-44, 2023. Abstract: We propose statistically robust and computationally efficient linear learning methods in the high-dimensional batch setting, where the number of features $d$ may exceed the sample size $n$. We employ, in a generic learning setting, two algorithms depending on whether the considered loss function is gradient-Lipschitz or not. Then, we instantiate our framework on several applications including vanilla sparse, group-sparse and low-rank matrix recovery. This leads, for each application

A Framework and Benchmark for Deep Batch Active Learning for Regression

Updated: 2023-07-31 22:55:03

David Holzmüller, Viktor Zaverkin, Johannes Kästner, Ingo Steinwart; 24(164):1-81, 2023. Abstract: The acquisition of labels for supervised learning can be expensive. To improve the sample efficiency of neural network regression, we study active learning methods that adaptively select batches of unlabeled data for labeling. We present a framework for constructing such methods out of network-dependent base kernels, kernel transformations, and selection methods. Our framework encompasses many existing Bayesian methods based

Learning Good State and Action Representations for Markov Decision Process via Tensor Decomposition

Updated: 2023-07-31 22:55:03

Chengzhuo Ni, Yaqi Duan, Munther Dahleh, Mengdi Wang, Anru R. Zhang; 24(115):1-53, 2023. Abstract: The transition kernel of a continuous-state-action Markov decision process (MDP) admits a natural tensor structure. This paper proposes a tensor-inspired unsupervised learning method to identify meaningful low-dimensional state and action representations from empirical trajectories. The method exploits the MDP's tensor structure by kernelization, importance sampling and low-Tucker-rank approximation.

Generalization Bounds for Adversarial Contrastive Learning

Updated: 2023-07-31 22:55:03

Xin Zou, Weiwei Liu; 24(114):1-54, 2023. Abstract: Deep networks are well-known to be fragile to adversarial attacks, and adversarial training is one of the most popular methods used to train a robust model. To take advantage of unlabeled data, recent works have applied adversarial training to contrastive learning (Adversarial Contrastive Learning, ACL for short) and obtain promising robust performance. However, the theory of ACL is not well understood. To fill this gap, we leverage the Rademacher complexity to analyze the generalization

Flexible Model Aggregation for Quantile Regression

Updated: 2023-07-31 22:55:03

Rasool Fakoor, Taesup Kim, Jonas Mueller, Alexander J. Smola, Ryan J. Tibshirani; 24(162):1-45, 2023. Abstract: Quantile regression is a fundamental problem in statistical learning motivated by a need to quantify uncertainty in predictions, or to model a diverse population without being overly reductive. For instance, epidemiological forecasts, cost estimates, and revenue predictions all benefit from being able to quantify the range of possible values accurately. As such, many models have been developed for this problem over many years of

FLIP: A Utility Preserving Privacy Mechanism for Time Series

Updated: 2023-07-31 22:55:03

Tucker McElroy, Anindya Roy, Gaurab Hore; 24(111):1-29, 2023. Abstract: Guaranteeing privacy in released data is an important goal for data-producing agencies. There has been extensive research on developing suitable privacy mechanisms in recent years. Particularly notable is the idea of noise addition with the guarantee of differential privacy. There are, however, concerns about compromising data utility when very stringent privacy mechanisms are applied. Such compromises can be quite stark in correlated data, such as time series

Dimensionless machine learning: Imposing exact units equivariance

Updated: 2023-07-31 22:55:03

Soledad Villar, Weichi Yao, David W. Hogg, Ben Blum-Smith, Bianca Dumitrascu; 24(109):1-32, 2023. Abstract: Units equivariance (or units covariance) is the exact symmetry that follows from the requirement that relationships among measured quantities of physics relevance must obey self-consistent dimensional scalings. Here, we express this symmetry in terms of a (non-compact) group action, and we employ dimensional analysis and ideas from equivariant machine learning to provide a methodology for exactly units-equivariant machine

Concentration analysis of multivariate elliptic diffusions

Updated: 2023-07-31 22:55:03

Lukas Trottner, Cathrine Aeckerle-Willems, Claudia Strauch; 24(106):1-38, 2023. Abstract: We prove concentration inequalities and associated PAC bounds for both continuous- and discrete-time additive functionals for possibly unbounded functions of multivariate, nonreversible diffusion processes. Our analysis relies on an approach via the Poisson equation allowing us to consider a very broad class of subexponentially ergodic, multivariate diffusion processes. These results add to existing concentration inequalities for additive functionals

Knowledge Hypergraph Embedding Meets Relational Algebra

Updated: 2023-07-31 22:55:03

Bahare Fatemi, Perouz Taslakian, David Vazquez, David Poole; 24(105):1-34, 2023. Abstract: Relational databases are a successful model for data storage, and rely on query languages for information retrieval. Most of these query languages are based on relational algebra, a mathematical formalization at the core of relational models. Knowledge graphs are flexible data storage structures that allow for knowledge completion using machine learning techniques. Knowledge hypergraphs generalize knowledge graphs by allowing multi-argument relations

Multivariate Soft Rank via Entropy-Regularized Optimal Transport: Sample Efficiency and Generative Modeling

Updated: 2023-07-31 22:55:03

Shoaib Bin Masud, Matthew Werenski, James M. Murphy, Shuchin Aeron; 24(160):1-65, 2023. Abstract: The framework of optimal transport has been leveraged to extend the notion of rank to the multivariate setting as corresponding to an optimal transport map, while preserving desirable properties of the resulting goodness-of-fit (GoF) statistics. In particular, the rank energy (RE) and rank maximum mean discrepancy (RMMD) are distribution-free under the null, exhibit high power in statistical

Sparse Training with Lipschitz Continuous Loss Functions and a Weighted Group L0-norm Constraint

Updated: 2023-07-31 22:55:03

Michael R. Metel; 24(103):1-44, 2023. Abstract: This paper is motivated by structured sparsity for deep neural network training. We study a weighted group $l_0$-norm constraint, and present the projection and normal cone of this set. Using randomized smoothing, we develop zeroth and first-order algorithms for minimizing a Lipschitz continuous function constrained by any closed set which can be projected onto. Non-asymptotic convergence guarantees are proven in expectation for the proposed algorithms for

Integrating Random Effects in Deep Neural Networks

Updated: 2023-07-31 22:55:03

Giora Simchoni, Saharon Rosset; 24(156):1-57, 2023. Abstract: Modern approaches to supervised learning like deep neural networks (DNNs) typically implicitly assume that observed responses are statistically independent. In contrast, correlated data are prevalent in real-life large-scale applications, with typical sources of correlation including spatial, temporal and clustering structures. These correlations are either ignored by DNNs, or ad-hoc solutions are developed for specific use cases. We propose to use the mixed models framework to handle

Adaptive Data Depth via Multi-Armed Bandits

Updated: 2023-07-31 22:55:03

Tavor Baharav, Tze Leung Lai; 24(155):1-29, 2023. Abstract: Data depth, introduced by Tukey (1975), is an important tool in data science, robust statistics, and computational geometry. One chief barrier to its broader practical utility is that many common measures of depth are computationally intensive, requiring on the order of $n^d$ operations to exactly compute the depth of a single point within a data set of $n$ points in $d$-dimensional space. Often, however, we are not directly interested in the absolute depths of the points, but rather in their

Adapting and Evaluating Influence-Estimation Methods for Gradient-Boosted Decision Trees

Updated: 2023-07-31 22:55:03

Jonathan Brophy, Zayd Hammoudeh, Daniel Lowd; 24(154):1-48, 2023. Abstract: Influence estimation analyzes how changes to the training data can lead to different model predictions; this analysis can help us better understand these predictions, the models making those predictions, and the data sets they are trained on. However, most influence-estimation techniques are designed for deep learning models with continuous parameters. Gradient-boosted decision trees (GBDTs) are a powerful and widely-used class of models

FedLab: A Flexible Federated Learning Framework

Updated: 2023-07-31 22:55:03

FedLab is a lightweight open-source framework for the simulation of federated learning. The design of FedLab focuses on federated learning algorithm effectiveness and communication efficiency. It allows customization on server optimization, client optimization, communication agreement, and communication compression. Also, FedLab is scalable in different deployment scenarios with different computation and communication resources. We hope FedLab could provide flexible APIs as well as reliable baseline implementations and relieve the burden of implementing novel approaches for researchers in the FL community. The source code, tutorial, and documentation can be found at https://github.com/SMILELab-FL/FedLab.

An Analysis of Robustness of Non-Lipschitz Networks

Updated: 2023-07-31 22:55:03

Maria-Florina Balcan, Avrim Blum, Dravyansh Sharma, Hongyang Zhang; 24(98):1-43, 2023. Abstract: Despite significant advances, deep networks remain highly susceptible to adversarial attack. One fundamental challenge is that small input perturbations can often produce large movements in the network’s final-layer feature space. In this paper, we define an attack model that abstracts this challenge, to help understand its intrinsic properties. In our model, the adversary may move data an arbitrary distance in feature space but only in random

Selective inference for k-means clustering

Updated: 2023-07-31 22:55:03

Yiqun T. Chen, Daniela M. Witten; 24(152):1-41, 2023. Abstract: We consider the problem of testing for a difference in means between clusters of observations identified via k-means clustering. In this setting, classical hypothesis tests lead to an inflated Type I error rate. In recent work, Gao et al. (2022) considered a related problem in the context of hierarchical clustering. Unfortunately, their solution is highly-tailored to the context of hierarchical clustering, and thus cannot be applied in the setting of k-means clustering. In this paper, we
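The inflated Type I error mentioned in this abstract can be seen in a short simulation. This is only a sketch of the problem, not the paper's selective-inference procedure: it assumes one-dimensional standard normal data with no true clusters, a minimal two-cluster Lloyd iteration, and a naive Welch statistic compared against the normal cutoff 1.96 (nominal level about 5%). The function names `two_means_1d` and `naive_rejection_rate` are hypothetical.

```python
import random
import statistics


def two_means_1d(xs, max_iters=100):
    """Lloyd's algorithm with k=2 on 1-D data, initialised at the min and max."""
    c0, c1 = min(xs), max(xs)
    left, right = [], []
    for _ in range(max_iters):
        # Assign each point to the nearer centre.
        left = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        right = [x for x in xs if abs(x - c0) > abs(x - c1)]
        new_c0, new_c1 = statistics.mean(left), statistics.mean(right)
        if (new_c0, new_c1) == (c0, c1):
            break
        c0, c1 = new_c0, new_c1
    return left, right


def naive_rejection_rate(trials=200, n=50, seed=0):
    """Fraction of null data sets where a naive two-sample test rejects."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # no true clusters
        left, right = two_means_1d(xs)
        if min(len(left), len(right)) < 2:
            continue  # sample variance undefined; count as non-rejection
        se = (statistics.variance(left) / len(left)
              + statistics.variance(right) / len(right)) ** 0.5
        t = (statistics.mean(right) - statistics.mean(left)) / se
        if abs(t) > 1.96:  # normal approximation to the 5% two-sided test
            rejections += 1
    return rejections / trials


rejection_rate = naive_rejection_rate()
```

Because the same data are used both to form the clusters and to test their means, the naive test rejects far more often than the nominal 5%, even though every data set here is drawn from a single Gaussian.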

Generalization error bounds for multiclass sparse linear classifiers

Updated: 2023-07-31 22:55:03

Tomer Levy, Felix Abramovich; 24(151):1-35, 2023. Abstract: We consider high-dimensional multiclass classification by sparse multinomial logistic regression. Unlike binary classification, in the multiclass setup one can think about an entire spectrum of possible notions of sparsity associated with different structural assumptions on the regression coefficients matrix. We propose a computationally feasible feature selection procedure based on penalized maximum likelihood with convex penalties capturing a specific type of sparsity at

Fitting Autoregressive Graph Generative Models through Maximum Likelihood Estimation

Updated: 2023-07-31 22:55:03

Xu Han, Xiaohui Chen, Francisco J. R. Ruiz, Li-Ping Liu; 24(97):1-30, 2023. Abstract: We consider the problem of fitting autoregressive graph generative models via maximum likelihood estimation (MLE). MLE is intractable for graph autoregressive models because the nodes in a graph can be arbitrarily reordered; thus the exact likelihood involves a sum over all possible node orders leading to the same graph. In this work, we fit the graph models by maximizing a variational bound, which is built by first deriving the

Statistical Inference for Noisy Incomplete Binary Matrix

Updated: 2023-07-31 22:55:03

Yunxiao Chen, Chengcheng Li, Jing Ouyang, Gongjun Xu; 24(95):1-66, 2023. Abstract: We consider the statistical inference for noisy incomplete binary (or 1-bit) matrix. Despite the importance of uncertainty quantification to matrix completion, most of the categorical matrix completion literature focuses on point estimation and prediction. This paper moves one step further toward statistical inference for binary matrix completion. Under a popular nonlinear factor analysis model, we obtain a point estimator and derive its asymptotic normality.

Faith-Shap: The Faithful Shapley Interaction Index

Updated: 2023-07-31 22:55:03

Che-Ping Tsai, Chih-Kuan Yeh, Pradeep Ravikumar; 24(94):1-42, 2023. Abstract: Shapley values, which were originally designed to assign attributions to individual players in coalition games, have become a commonly used approach in explainable machine learning to provide attributions to input features for black-box machine learning models. A key attraction of Shapley values is that they uniquely satisfy a very natural set of axiomatic properties. However, extending the Shapley value to assigning attributions to interactions rather than
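For readers unfamiliar with the classical Shapley value that this abstract starts from, the following sketch computes it exactly for a tiny coalition game by averaging marginal contributions over all player orderings. It covers only the standard per-player value, not the interaction index the paper proposes; the "glove game" and the names `shapley_values` and `glove_game` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}


def glove_game(coalition):
    """Hypothetical game: a coalition is worth 1 if it holds a matching pair."""
    has_left = bool(coalition & {"left_a", "left_b"})
    return 1 if has_left and "right" in coalition else 0


phi = shapley_values(["left_a", "left_b", "right"], glove_game)
```

The enumeration costs a factorial number of orderings, so it is only practical for a handful of players; the values sum to the worth of the full coalition, which is the efficiency axiom among the properties the abstract alludes to.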

MALib: A Parallel Framework for Population-based Multi-agent Reinforcement Learning

Updated: 2023-07-31 22:55:03

Ming Zhou, Ziyu Wan, Hanjing Wang, Muning Wen, Runzhe Wu, Ying Wen, Yaodong Yang, Yong Yu, Jun Wang, Weinan Zhang; 24(150):1-12, 2023. Abstract: Population-based multi-agent reinforcement learning (PB-MARL) encompasses a range of methods that merge dynamic population selection with multi-agent reinforcement learning algorithms (MARL). While PB-MARL has demonstrated notable achievements in complex multi-agent tasks, its sequential execution is plagued by low computational efficiency due to the diversity in

Decentralized Learning: Theoretical Optimality and Practical Improvements

Updated: 2023-07-31 22:55:03

Yucheng Lu, Christopher De Sa; 24(93):1-62, 2023. Abstract: Decentralization is a promising method of scaling up parallel machine learning systems. In this paper, we provide a tight lower bound on the iteration complexity for such methods in a stochastic non-convex setting. Our lower bound reveals a theoretical gap in known convergence rates of many existing decentralized training algorithms, such as D-PSGD. We prove by construction this lower bound is tight and achievable. Motivated by our insights, we further propose

Non-Asymptotic Guarantees for Robust Statistical Learning under Infinite Variance Assumption

Updated: 2023-07-31 22:55:03

There has been a surge of interest in developing robust estimators for models with heavy-tailed and bounded variance data in statistics and machine learning, while few works impose unbounded variance. This paper proposes two types of robust estimators, the ridge log-truncated M-estimator and the elastic net log-truncated M-estimator. The first estimator is applied to convex regressions such as quantile regression and generalized linear models, while the other one is applied to high dimensional non-convex learning problems such as regressions via deep neural networks. Simulations and real data analysis demonstrate the robustness of log-truncated estimations over standard estimations.

Outlier-Robust Subsampling Techniques for Persistent Homology

Updated: 2023-07-31 22:55:03

Bernadette J. Stolz; 24(90):1-35, 2023. Abstract: In recent years, persistent homology has been successfully applied to real-world data in many different settings. Despite significant computational advances, persistent homology algorithms do not yet scale to large datasets, preventing interesting applications. One approach to address computational issues posed by persistent homology is to select a set of landmarks by subsampling from the data. Currently, these landmark points are chosen either at random or using the maxmin algorithm.

Neural Operator: Learning Maps Between Function Spaces With Applications to PDEs

Updated: 2023-07-31 22:55:03

Nikola Kovachki, Zongyi Li, Burigede Liu, Kamyar Azizzadenesheli, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar; 24(89):1-97, 2023. Abstract: The classical development of neural networks has primarily focused on learning mappings between finite dimensional Euclidean spaces or finite sets. We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite dimensional function spaces. We formulate the neural operator as a composition of linear integral

Controlling Wasserstein Distances by Kernel Norms with Application to Compressive Statistical Learning

Updated: 2023-07-31 22:55:03

Titouan Vayer, Rémi Gribonval; 24(149):1-51, 2023. Abstract: Comparing probability distributions is at the crux of many machine learning algorithms. Maximum Mean Discrepancies (MMD) and Wasserstein distances are two classes of distances between probability distributions that have attracted abundant attention in past years. This paper establishes some conditions under which the Wasserstein distance can be controlled by MMD norms. Our work is motivated by the compressive statistical learning (CSL) theory

Dimension-Grouped Mixed Membership Models for Multivariate Categorical Data

Updated: 2023-07-31 22:55:03

Yuqi Gu, Elena E. Erosheva, Gongjun Xu, David B. Dunson; 24(88):1-49, 2023. Abstract: Mixed Membership Models (MMMs) are a popular family of latent structure models for complex multivariate data. Instead of forcing each subject to belong to a single cluster, MMMs incorporate a vector of subject-specific weights characterizing partial membership across clusters. With this flexibility come challenges in uniquely identifying, estimating, and interpreting the parameters. In this article, we propose a new class of

Gaussian Processes with Errors in Variables: Theory and Computation

Updated: 2023-07-31 22:55:03

Shuang Zhou, Debdeep Pati, Tianying Wang, Yun Yang, Raymond J. Carroll; 24(87):1-53, 2023. Abstract: Covariate measurement error in nonparametric regression is a common problem in nutritional epidemiology and geostatistics, and other fields. Over the last two decades, this problem has received substantial attention in the frequentist literature. Bayesian approaches for handling measurement error have only been explored recently and are surprisingly successful, although there still is a lack of a proper theoretical

Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and Personalized Federated Learning

Updated: 2023-07-31 22:55:03

Bokun Wang, Zhuoning Yuan, Yiming Ying, Tianbao Yang; 24(145):1-46, 2023. Abstract: In recent years, model-agnostic meta-learning (MAML) has become a popular research area. However, the stochastic optimization of MAML is still underdeveloped. Existing MAML algorithms rely on the “episode” idea by sampling a few tasks and data points to update the meta-model at each iteration. Nonetheless, these algorithms either fail to guarantee convergence with a constant mini-batch size or require processing a

Bayes-Newton Methods for Approximate Bayesian Inference with PSD Guarantees

Updated: 2023-07-31 22:55:03

William J. Wilkinson, Simo Särkkä, Arno Solin; 24(83):1-50, 2023. Abstract: We formulate natural gradient variational inference (VI), expectation propagation (EP), and posterior linearisation (PL) as extensions of Newton's method for optimising the parameters of a Bayesian posterior distribution. This viewpoint explicitly casts inference algorithms under the framework of numerical optimisation. We show that common approximations to Newton's method from the optimisation literature, namely Gauss-Newton and quasi-Newton methods (e.g.

Escaping The Curse of Dimensionality in Bayesian Model-Based Clustering

Updated: 2023-07-31 22:55:03

Noirrit Kiran Chandra, Antonio Canale, David B. Dunson; 24(144):1-42, 2023. Abstract: Bayesian mixture models are widely used for clustering of high-dimensional data with appropriate uncertainty quantification. However, as the dimension of the observations increases, posterior inference often tends to favor too many or too few clusters. This article explains this behavior by studying the random partition posterior in a non-standard setting with a fixed sample size and increasing data dimensionality. We provide conditions

Large sample spectral analysis of graph-based multi-manifold clustering

Updated: 2023-07-31 22:55:03

Nicolas Garcia Trillos, Pengfei He, Chenghui Li; 24(143):1-71, 2023. Abstract: In this work we study statistical properties of graph-based algorithms for multi-manifold clustering (MMC). In MMC the goal is to retrieve the multi-manifold structure underlying a given Euclidean data set when this one is assumed to be obtained by sampling a distribution on a union of manifolds $M = M_1 \cup \dots \cup M_N$ that may intersect with each other and that may have different dimensions. We investigate sufficient conditions that similarity graphs on

Fast Online Changepoint Detection via Functional Pruning CUSUM Statistics

Updated: 2023-07-31 22:55:03

Gaetano Romano, Idris A. Eckley, Paul Fearnhead, Guillem Rigaill; 24(81):1-36, 2023. Abstract: Many modern applications of online changepoint detection require the ability to process high-frequency observations, sometimes with limited available computational resources. Online algorithms for detecting a change in mean often involve using a moving window, or specifying the expected size of change. Such choices affect which changes the algorithms have most power to detect. We introduce an algorithm, Functional Online CuSUM

Approximate Post-Selective Inference for Regression with the Group LASSO

Updated: 2023-07-31 22:55:03

Snigdha Panigrahi, Peter W MacDonald, Daniel Kessler; 24(79):1-49, 2023. Abstract: After selection with the Group LASSO (or generalized variants such as the overlapping, sparse, or standardized Group LASSO), inference for the selected parameters is unreliable in the absence of adjustments for selection bias. In the penalized Gaussian regression setup, existing approaches provide adjustments for selection events that can be expressed as linear inequalities in the data variables. Such a representation, however, fails to hold

On Tilted Losses in Machine Learning: Theory and Applications

Updated: 2023-07-31 22:55:03

Tian Li, Ahmad Beirami, Maziar Sanjabi, Virginia Smith; 24(142):1-79, 2023. Abstract: Exponential tilting is a technique commonly used in fields such as statistics, probability, information theory, and optimization to create parametric distribution shifts. Despite its prevalence in related fields, tilting has not seen widespread use in machine learning. In this work, we aim to bridge this gap by exploring the use of tilting in risk minimization. We study a simple extension to ERM, tilted empirical risk minimization (TERM), which

A Randomized Subspace-based Approach for Dimensionality Reduction and Important Variable Selection

Updated: 2023-07-31 22:55:03

Di Bo, Hoon Hwangbo, Vinit Sharma, Corey Arndt, Stephanie TerMaath; 24(76):1-31, 2023. Abstract: An analysis of high-dimensional data can offer a detailed description of a system but is often challenged by the curse of dimensionality. General dimensionality reduction techniques can alleviate such difficulty by extracting a few important features, but they are limited due to the lack of interpretability and connectivity to actual decision making associated with each physical variable. Variable

Intrinsic Persistent Homology via Density-based Metric Learning

Updated: 2023-07-31 22:55:03

Ximena Fernández, Eugenio Borghini, Gabriel Mindlin, Pablo Groisman; 24(75):1-42, 2023. Abstract: We address the problem of estimating topological features from data in high dimensional Euclidean spaces under the manifold assumption. Our approach is based on the computation of persistent homology of the space of data points endowed with a sample metric known as Fermat distance. We prove that such metric space converges almost surely to the manifold itself endowed with an intrinsic metric that accounts for both the geometry of the

Asymptotics of Network Embeddings Learned via Subsampling

Updated: 2023-07-31 22:55:03

Andrew Davison, Morgane Austern; 24(138):1-120, 2023. Abstract: Network data are ubiquitous in modern machine learning, with tasks of interest including node classification, node clustering and link prediction. A frequent approach begins by learning a Euclidean embedding of the network, to which algorithms developed for vector-valued data are applied. For large networks, embeddings are learned using stochastic gradient methods where the sub-sampling scheme can be freely chosen. Despite the strong empirical performance of such methods,

Dimension Reduction in Contextual Online Learning via Nonparametric Variable Selection

Updated: 2023-07-31 22:55:03

Wenhao Li, Ningyuan Chen, L. Jeff Hong; 24(136):1-84, 2023. Abstract: We consider a contextual online learning (multi-armed bandit) problem with high-dimensional covariate $x$ and decision $y$. The reward function to learn, $f(x,y)$, does not have a particular parametric form. The literature has shown that the optimal regret is $\tilde{O}(T^{(d_x+d_y+1)/(d_x+d_y+2)})$, where $d_x$ and $d_y$ are the dimensions of $x$ and $y$, and thus it suffers from the curse of dimensionality. In many applications, only a small subset of

Sparse GCA and Thresholded Gradient Descent

Updated: 2023-07-31 22:55:03

Sheng Gao, Zongming Ma; 24(135):1-61, 2023. Abstract: Generalized correlation analysis (GCA) is concerned with uncovering linear relationships across multiple data sets. It generalizes canonical correlation analysis that is designed for two data sets. We study sparse GCA when there are potentially multiple leading generalized correlation tuples in data that are of interest and the loading matrix has a small number of nonzero rows. It includes sparse CCA and sparse PCA of correlation matrices as special cases. We first formulate sparse GCA as a generalized

MARS: A Second-Order Reduction Algorithm for High-Dimensional Sparse Precision Matrices Estimation

Updated: 2023-07-31 22:55:03

Qian Li, Binyan Jiang, Defeng Sun; 24(134):1-44, 2023. Abstract: Estimation of the precision matrix (or inverse covariance matrix) is of great importance in statistical data analysis and machine learning. However, as the number of parameters scales quadratically with the dimension $p$, the computation becomes very challenging when $p$ is large. In this paper, we propose an adaptive sieving reduction algorithm to generate a solution path for the estimation of precision matrices under the $\ell_1$

Exploiting Discovered Regression Discontinuities to Debias Conditioned-on-observable Estimators

Updated: 2023-07-31 22:55:03

Benjamin Jakubowski, Sriram Somanchi, Edward McFowland III, Daniel B. Neill; 24(133):1-57, 2023. Abstract: Regression discontinuity (RD) designs are widely used to estimate causal effects in the absence of a randomized experiment. However, standard approaches to RD analysis face two significant limitations. First, they require a priori knowledge of discontinuities in treatment. Second, they yield doubly-local treatment effect estimates, and fail to provide more general causal effect estimates away

Combinatorial Optimization and Reasoning with Graph Neural Networks

Updated: 2023-07-31 22:55:03

Combinatorial optimization is a well-established area in operations research and computer science. Until recently, its methods have focused on solving problem instances in isolation, ignoring that they often stem from related data distributions in practice. However, recent years have seen a surge of interest in using machine learning, especially graph neural networks, as a key building block for combinatorial tasks, either directly as solvers or by enhancing exact solvers. The inductive bias of GNNs effectively encodes combinatorial and relational input due to their invariance to permutations and awareness of input sparsity. This paper presents a conceptual review of recent key advancements in this emerging field, aiming at optimization and machine learning researchers.

An Eigenmodel for Dynamic Multilayer Networks

Updated: 2023-07-31 22:55:03

Joshua Daniel Loyal, Yuguo Chen; 24(128):1-69, 2023. Abstract: Dynamic multilayer networks frequently represent the structure of multiple co-evolving relations; however, statistical models are not well-developed for this prevalent network type. Here, we propose a new latent space model for dynamic multilayer networks. The key feature of our model is its ability to identify common time-varying structures shared by all layers while also accounting for layer-wise variation and degree heterogeneity. We establish the identifiability of the model's parameters

Graph Clustering with Graph Neural Networks

Updated: 2023-07-31 22:55:03

Anton Tsitsulin, John Palowitch, Bryan Perozzi, Emmanuel Müller; 24(127):1-21, 2023. Abstract: Graph Neural Networks (GNNs) have achieved state-of-the-art results on many graph analysis tasks such as node classification and link prediction. However, important unsupervised problems on graphs, such as graph clustering, have proved more resistant to advances in GNNs. Graph clustering has the same overall goal as node pooling in GNNs. Does this mean that GNN pooling methods do a good job at clustering graphs? Surprisingly, the answer is no: current GNN pooling

Statistical Robustness of Empirical Risks in Machine Learning

Updated: 2023-07-31 22:55:03

Shaoyan Guo, Huifu Xu, Liwei Zhang; 24(125):1-38, 2023. Abstract: This paper studies convergence of empirical risks in reproducing kernel Hilbert spaces (RKHS). A conventional assumption in the existing research is that empirical training data are generated by the unknown true probability distribution, but this may not be satisfied in some practical circumstances. Consequently, the existing convergence results may not provide a guarantee as to whether the empirical risks are reliable or not when the data are potentially corrupted (generated by a

Comparing home run in distance different stadiums

Updated: 2023-07-31 07:31:00

July 31, 2023. Topic: Infographics. Tags: baseball, home run, Washington Post. In Major League Baseball, a player hits a home run when the ball flies over the outfield fence. However, the distance between the hitter and the outfield fence varies by stadium, which means a home run in one stadium might not be far enough for a home run in a different stadium. For The Washington Post, Kevin Schaul made a thing that lets you compare stadiums.

HOW TO PROPERLY CARE FOR YOUR EYES WHEN WEARING CONTACT LENSES

Updated: 2023-07-28 08:41:01

This infographic offers eye care advice for proper use and maintenance of contact lenses, ensuring eye safety and health. Learn effective techniques for optimal use. Source: https://www.lensworld.com.au/blog/how-to-properly-care-for-your-eyes-when-wearing-contact-lenses-4-top-tips/

Barbie and Oppenheimer themes for charts in R

Updated: 2023-07-28 07:03:59

July 28, 2023. Topic: Software. Tags: Barbie, ggplot, Matthew Jané, Oppenheimer, R, theme. Matthew Jané made a small R package called Theme Park, which is meant to supply movie-based themes for ggplot. For now, it just has Barbie and Oppenheimer themes.

✚ Visualization Tools and Learning Resources, July 2023 Roundup

Updated: 2023-07-27 18:30:47

July 27, 2023. Topic: The Process. Tags: roundup. Welcome to The Process, where we look closer at how the charts get made. This is issue 249. Thanks for supporting this small visualization corner of the internet. I’m Nathan Yau, and throughout the month I collect tools and resources to help you make better charts. This is the good stuff for July. The Process is a weekly newsletter on how visualization tools, rules, and guidelines work in practice. I publish every Thursday.

John Snow’s cholera map, an animated version

Updated: 2023-07-26 07:58:03

Sarah Bell made an animated version of John Snow’s classic map from 1854. Tags: animation, cholera, John Snow, Sarah Bell

Understanding the SVG path element, a visual guide

Updated: 2023-07-20 07:51:51

July 20, 2023. Topic: Coding. Tags: Nanda Syahrasyad, paths, SVG. The SVG path element can be useful for drawing regular and irregular shapes. However, if you just look at how a path is defined, it’s not entirely clear how to use it. Nanda Syahrasyad made a visual guide to help you figure it out.

How to Visualize Data with Pareto Charts Using JavaScript

Updated: 2023-07-11 06:26:17

Welcome to this step-by-step tutorial that will empower you to create an interactive Pareto chart using JavaScript that will look nice on any device and in any browser! A Pareto chart is a captivating graphical combo representation that showcases individual values through descending bars, while a line graph illustrates the cumulative total. It is a […] The post How to Visualize Data with Pareto Charts Using JavaScript appeared first on AnyChart News.
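The two series a Pareto chart needs (bars in descending order plus a running cumulative total) can be sketched in a few lines. This is only an illustration of the data preparation, not the tutorial's JavaScript code: it is written in Python, and the function name `pareto_series` and the defect counts are hypothetical.

```python
from itertools import accumulate


def pareto_series(counts):
    """Return (labels, values, cumulative_percent) for a Pareto chart.

    `counts` maps a category label to its count. The name and the
    sample data below are assumptions made for illustration only.
    """
    # Bars: categories sorted by value, largest first.
    ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    labels = [label for label, _ in ordered]
    values = [value for _, value in ordered]

    total = sum(values)
    if total == 0:
        # Avoid division by zero for empty or all-zero input.
        return labels, values, [0.0] * len(values)

    # Line: running total expressed as a percentage of the grand total.
    cumulative = [100.0 * running / total for running in accumulate(values)]
    return labels, values, cumulative


# Hypothetical defect counts.
defects = {"scratch": 5, "dent": 20, "misprint": 15, "other": 10}
labels, values, cumulative = pareto_series(defects)
print(labels)      # bar order
print(cumulative)  # cumulative line, ends at 100.0
```

The bar values and the cumulative percentages can then be handed to any charting library as a column series and a line series on a secondary axis.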

Freiheit Gruppe Uses AnyChart JS to Visually Organize Berliners’ Ideas for Berlin University Alliance

Updated: 2023-07-05 07:14:07

July 5th, 2023, by AnyChart Team. Data visualization is a game-changer when it comes to exploring and making sense of data. And here at AnyChart we’re passionate about making the development of interactive charts a breeze. Our

Data Visualization

Exploring ways to display data

Sources

53 - JMLR

5 - FlowingData

2 - AnyChart News

1 - Infographics Submission Hub
